Review for NeurIPS paper: Learning under Model Misspecification: Applications to Variational and Ensemble methods
Summary and Contributions: POST REBUTTAL: I was happy to see that all important issues I raised were addressed to my satisfaction by the rebuttal. Having reviewed an earlier version of this paper for ICML, I know that the authors will keep the promises made in their rebuttal. One minor issue that the rebuttal addresses incorrectly is the set of assumptions on loss functions (as they relate to PAC-Bayesian bounds). The claim in the rebuttal that the results hold for 'general unbounded losses' is incorrect. In particular, the exponentially fast convergence of the probability towards zero is *not* guaranteed for general unbounded losses, and this is actually made rather clear in the references you provide [17, 45].
In this paper, the authors analyze the poor performance of Bayesian model averaging under model misspecification using new second-order PAC-Bayes bounds, and they use these tools to propose an alternative approach to handling misspecification. Three of the four reviewers are excited about the novelty of the approach and its important practical contribution. That being said, the authors need to be careful to follow up on the promises of clarification made in the author feedback throughout the paper, and they must correct the error in the author feedback flagged by Reviewer 1 (please see the revised review).
Learning under Model Misspecification: Applications to Variational and Ensemble methods
Virtually no model we use in machine learning to make predictions perfectly represents reality, so most learning happens under model misspecification. In this work, we present a novel analysis of the generalization performance of Bayesian model averaging under model misspecification and i.i.d. data. This analysis shows, in simple and intuitive terms, that Bayesian model averaging provides suboptimal generalization performance when the model is misspecified. As a consequence, we provide strong theoretical arguments showing that Bayesian methods are not optimal for learning predictive models unless the model class is perfectly specified.